Mathematical Formula


Advancing network resilience theories with symbolized reinforcement learning

Zheng, Yu, Ding, Jingtao, Jin, Depeng, Gao, Jianxi, Li, Yong

arXiv.org Artificial Intelligence

Many complex networks display remarkable resilience under external perturbations, internal failures and environmental changes, yet they can swiftly deteriorate into dysfunction upon the removal of a few keystone nodes. Discovering theories that measure network resilience offers the potential to prevent catastrophic collapses--from species extinctions to financial crises--with profound implications for real-world systems. Current resilience theories address the problem from the single perspective of topology, neglecting the crucial role of system dynamics, because the intrinsic complexity of the coupling between topology and dynamics exceeds the capabilities of human analytical methods. Here, we report an automatic method for resilience theory discovery, which learns from how AI solves a complicated network dismantling problem and symbolizes its network attack strategies into theoretical formulas. This proposed self-inductive approach discovers the first resilience theory that accounts for both topology and dynamics, highlighting how the correlation between node degree and state shapes overall network resilience, and offering insights for designing early warning signals of systemic collapses. Additionally, our approach discovers formulas that refine existing well-established resilience theories with over 37.5% improvement in accuracy, significantly advancing human understanding of complex networks with AI.


MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training

Drechsel, Jonathan, Reusch, Anja, Herbold, Steffen

arXiv.org Artificial Intelligence

Mathematical formulas are a fundamental and widely used component in various scientific fields, serving as a universal language for expressing complex concepts and relationships. While state-of-the-art transformer models excel in processing and understanding natural language, they encounter challenges with mathematical notation, which involves a complex structure and diverse representations. This study focuses on the development of specialized training datasets to enhance the encoding of mathematical content. We introduce Math Mutator (MAMUT), a framework capable of generating equivalent and falsified versions of a given mathematical formula in LaTeX notation, effectively capturing the variety of mathematical notation for the same concept. Based on MAMUT, we have generated four large mathematical datasets containing diverse notation, which can be used to train language models with enhanced mathematical embeddings.
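
The abstract does not describe the framework in enough detail to reproduce, but a minimal sketch of the underlying idea, deriving one equivalent and one falsified LaTeX variant of a formula with sympy, might look like the following. The transformation rules here are hypothetical stand-ins, not MAMUT's actual rule set:

```python
# Toy sketch of formula mutation (illustrative; not the MAMUT implementation):
# derive one equivalent and one falsified LaTeX variant of a formula.
import sympy as sp

x = sp.symbols("x")
formula = (x + 1) ** 2

# Equivalent variant: an algebra-preserving rewrite (expansion).
equivalent = sp.expand(formula)       # x**2 + 2*x + 1

# Falsified variant: a small perturbation that breaks the identity.
falsified = equivalent + 1            # x**2 + 2*x + 2

print(sp.latex(formula))     # \left(x + 1\right)^{2}
print(sp.latex(equivalent))  # x^{2} + 2 x + 1
print(sp.latex(falsified))   # x^{2} + 2 x + 2
```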


Trainable Adaptive Activation Function Structure (TAAFS) Enhances Neural Network Force Field Performance with Only Dozens of Additional Parameters

Li, Enji

arXiv.org Artificial Intelligence

At the heart of neural network force fields (NNFFs) is the architecture of neural networks, where the capacity to model complex interactions is typically enhanced through widening or deepening multilayer perceptrons (MLPs) or by increasing layers of graph neural networks (GNNs). These enhancements, while improving the model's performance, often come at the cost of a substantial increase in the number of parameters. By applying the Trainable Adaptive Activation Function Structure (TAAFS), we introduce a method that selects distinct mathematical formulations for non-linear activations, thereby increasing the precision of NNFFs with an insignificant addition to the parameter count. In this study, we integrate TAAFS into a variety of neural network models, resulting in observed accuracy improvements, and further validate these enhancements through molecular dynamics (MD) simulations using DeepMD.
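
The abstract does not spell out the TAAFS formulation, so the following PyTorch module is only an assumed sketch of the general idea: a drop-in activation that mixes a few fixed nonlinearities with learnable weights, so each layer gains just a handful of trainable parameters. The class name and mixing scheme are illustrative inventions, not the paper's definition:

```python
# Minimal sketch of a trainable activation (illustrative; not the actual
# TAAFS formulation): blend several fixed nonlinearities with learnable
# weights, adding only a handful of parameters per layer.
import torch
import torch.nn as nn

class TrainableActivation(nn.Module):
    def __init__(self):
        super().__init__()
        # Three learnable mixing coefficients -- the only extra parameters.
        self.w = nn.Parameter(torch.ones(3) / 3)

    def forward(self, x):
        weights = torch.softmax(self.w, dim=0)  # keep the mixture normalized
        return (weights[0] * torch.tanh(x)
                + weights[1] * torch.relu(x)
                + weights[2] * torch.sigmoid(x))

# Drop-in replacement for a fixed activation inside an MLP.
mlp = nn.Sequential(nn.Linear(16, 32), TrainableActivation(), nn.Linear(32, 1))
print(sum(p.numel() for p in mlp.parameters()))  # 580: the activation adds just 3
```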


RedStone: Curating General, Code, Math, and QA Data for Large Language Models

Chang, Yaoyao, Cui, Lei, Dong, Li, Huang, Shaohan, Huang, Yangyu, Huang, Yupan, Li, Scarlett, Lv, Tengchao, Ma, Shuming, Sun, Qinzheng, Wang, Wenhui, Wei, Furu, Xin, Ying, Yang, Mao, Yin, Qiufeng, Zhang, Xingxing

arXiv.org Artificial Intelligence

Pre-training Large Language Models (LLMs) on high-quality, meticulously curated datasets is widely recognized as critical for enhancing their performance and generalization capabilities. This study explores the untapped potential of Common Crawl as a comprehensive and flexible resource for pre-training LLMs, addressing both general-purpose language understanding and specialized domain knowledge. We introduce RedStone, an innovative and scalable pipeline engineered to extract and process data from Common Crawl, facilitating the creation of extensive and varied pre-training datasets. Unlike traditional datasets, which often require expensive curation and domain-specific expertise, RedStone leverages the breadth of Common Crawl to deliver datasets tailored to a wide array of domains. In this work, we exemplify its capability by constructing pre-training datasets across multiple fields, including general language understanding, code, mathematics, and question-answering tasks. The flexibility of RedStone allows for easy adaptation to other specialized domains, significantly lowering the barrier to creating valuable domain-specific datasets. Our findings demonstrate that Common Crawl, when harnessed through effective pipelines like RedStone, can serve as a rich, renewable source of pre-training data, unlocking new avenues for domain adaptation and knowledge discovery in LLMs. This work also underscores the importance of innovative data acquisition strategies and highlights the role of web-scale data as a powerful resource in the continued evolution of LLMs. RedStone code and data samples will be publicly available at \url{https://aka.ms/redstone}.


Image-to-LaTeX Converter for Mathematical Formulas and Text

Gurgurov, Daniil, Morshnev, Aleksey

arXiv.org Artificial Intelligence

In this project, we train a vision encoder-decoder model to generate LaTeX code from images of mathematical formulas and text. Utilizing a diverse collection of image-to-LaTeX data, we build two models: a base model with a Swin Transformer encoder and a GPT-2 decoder, trained on machine-generated images, and a fine-tuned version enhanced with Low-Rank Adaptation (LoRA) trained on handwritten formulas. We then compare the BLEU performance of our specialized model on a handwritten test set with other similar models, such as Pix2Text, TexTeller, and Sumen. Through this project, we contribute open-source models for converting images to LaTeX and provide from-scratch code for building these models with distributed training and GPU optimizations.
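
As a rough sketch of how such an encoder-decoder pairing can be assembled with the Hugging Face transformers library; the checkpoint names and wiring here are assumptions for illustration, not the authors' released configuration:

```python
# Sketch of pairing a Swin encoder with a GPT-2 decoder via Hugging Face
# transformers (checkpoint names are illustrative, not the authors' setup).
from transformers import VisionEncoderDecoderModel, AutoTokenizer, AutoImageProcessor

model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/swin-base-patch4-window7-224",  # vision encoder
    "gpt2",                                    # autoregressive text decoder
)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
processor = AutoImageProcessor.from_pretrained("microsoft/swin-base-patch4-window7-224")

# Decoding configuration required before generation.
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.eos_token_id

# At inference time:
#   pixel_values = processor(image, return_tensors="pt").pixel_values
#   latex_ids = model.generate(pixel_values, max_new_tokens=256)
#   latex = tokenizer.decode(latex_ids[0], skip_special_tokens=True)
```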


Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models

Petruzzellis, Flavio, Testolin, Alberto, Sperduti, Alessandro

arXiv.org Artificial Intelligence

Large Language Models (LLMs) achieve impressive performance in a wide range of tasks, even though they are often trained solely with the objective of chatting fluently with users. Among other skills, LLMs show emergent abilities in mathematical reasoning benchmarks, which can be elicited with appropriate prompting methods. In this work, we systematically investigate the capabilities and limitations of popular open-source LLMs on different symbolic reasoning tasks. We evaluate three models of the Llama 2 family on two datasets that require solving mathematical formulas of varying degrees of difficulty. We test a generalist LLM (Llama 2 Chat) as well as two fine-tuned versions of Llama 2 (MAmmoTH and MetaMath) specifically designed to tackle mathematical problems. We observe that both increasing the scale of the model and fine-tuning it on relevant tasks lead to significant performance gains. Furthermore, using fine-grained evaluation measures, we find that such performance gains are mostly observed with mathematical formulas of low complexity, which nevertheless often remain challenging even for the largest fine-tuned models.


What is Data Science?

#artificialintelligence

As the definition says, "Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from noisy, structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains". In this article we will discuss Machine Learning in more detail. Suppose we have gone to a university to do some research on how its students' marks relate to the number of hours they study. Note: this is hypothetical data, not real-world data. The data says that a student who studies 3 hours gets 30 marks and a student who studies 8 hours gets 80 marks.
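
The hours-versus-marks setup described here is the textbook setting for simple linear regression. A minimal sketch, using invented points consistent with the article's hypothetical data, could look like this:

```python
# Fit a line to the hypothetical hours-studied vs. marks data from the text.
# The points are invented to match the examples given (3h -> 30, 8h -> 80).
import numpy as np

hours = np.array([3.0, 5.0, 8.0])
marks = np.array([30.0, 50.0, 80.0])

# Least-squares fit of marks = slope * hours + intercept.
slope, intercept = np.polyfit(hours, marks, deg=1)
print(f"marks = {slope:.1f} * hours + {intercept:.1f}")  # marks = 10.0 * hours + 0.0

# Predict marks for a student who studies 6 hours.
print(slope * 6 + intercept)  # 60.0
```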


How to Learn Machine Learning from Scratch - Machine Learning Specialist- Emirhan BULUT

#artificialintelligence

There is indeed a lot of information and academic material on the internet in the field of Machine Learning. Many mathematical formulas, such as the entropy and Gini impurity used in decision trees or the Euclidean distance used in KNN, are available on the official documentation pages of libraries or on mathematical websites. You will not need any university or course for this. As you know, all models and algorithms used in Machine Learning are based on mathematics. As a matter of fact, Machine Learning is the autonomous form of mathematics taught to the machine. There are several advantages to using mathematics in this field.
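
For concreteness, the formulas mentioned above can be written out in a few lines; this is an illustrative sketch with made-up inputs:

```python
# The formulas mentioned above, written out directly (illustrative values).
import numpy as np

def entropy(p):
    """Shannon entropy H = -sum(p_i * log2(p_i)) of a class distribution."""
    p = np.asarray(p)
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return -np.sum(p * np.log2(p))

def gini(p):
    """Gini impurity G = 1 - sum(p_i^2) of a class distribution."""
    p = np.asarray(p)
    return 1.0 - np.sum(p ** 2)

def euclidean(a, b):
    """Euclidean distance used by KNN to find the nearest neighbors."""
    return np.sqrt(np.sum((np.asarray(a) - np.asarray(b)) ** 2))

print(entropy([0.5, 0.5]))        # 1.0 bit: a maximally impure binary split
print(gini([0.5, 0.5]))           # 0.5: maximum Gini impurity for two classes
print(euclidean([0, 0], [3, 4]))  # 5.0: the classic 3-4-5 triangle
```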


MathBERT: A Pre-Trained Model for Mathematical Formula Understanding

Peng, Shuai, Yuan, Ke, Gao, Liangcai, Tang, Zhi

arXiv.org Artificial Intelligence

Large-scale pre-trained models like BERT have achieved great success in various Natural Language Processing (NLP) tasks, yet it is still a challenge to adapt them to math-related tasks. Current pre-trained models neglect the structural features of formulas and the semantic correspondence between a formula and its context. To address these issues, we propose a novel pre-trained model, namely \textbf{MathBERT}, which is jointly trained on mathematical formulas and their corresponding contexts. In addition, to further capture the semantic-level structural features of formulas, a new pre-training task is designed to predict masked formula substructures extracted from the Operator Tree (OPT), the semantic structural representation of formulas. We conduct various experiments on three downstream tasks to evaluate the performance of MathBERT: mathematical information retrieval, formula topic classification and formula headline generation. Experimental results demonstrate that MathBERT significantly outperforms existing methods on all three tasks. Moreover, we qualitatively show that this pre-trained model effectively captures the semantic-level structural information of formulas. To the best of our knowledge, MathBERT is the first pre-trained model for mathematical formula understanding.
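
The Operator Tree (OPT) representation mentioned in the abstract can be illustrated with sympy, which exposes every expression as a tree of operators over operands. The sketch below is only a toy view of that structure, not MathBERT's actual preprocessing:

```python
# Toy view of a formula's operator-tree structure (not MathBERT's pipeline):
# sympy represents every expression as a tree of operators over operands.
import sympy as sp

x, y = sp.symbols("x y")
formula = x ** 2 + sp.sin(x * y)

def print_tree(expr, depth=0):
    """Recursively print the operator tree of a sympy expression."""
    label = expr.func.__name__ if expr.args else str(expr)
    print("  " * depth + label)
    for arg in expr.args:
        print_tree(arg, depth + 1)

print_tree(formula)  # Add at the root, with Pow(x, 2) and sin(Mul(x, y)) subtrees
```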


NLP using Deep Learning Tutorials: Understand the Activation Function

#artificialintelligence

This article is the first in a series I'm writing, in which I will address the topic of using Deep Learning in NLP. I was originally writing an article with an example of text classification using a perceptron, but I thought it would be better to first review some basics, such as activation and loss functions. Activation functions are introduced in neural networks to capture complex relationships in data. They are nonlinear functions, generally added at the end of a network, that transform data from a complex to a simpler format, making it easier to interpret according to the main purpose of the model. There are many types of activation functions.
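
To make this concrete, here is a short sketch of three of the most common activation functions, evaluated at a few points (the values in the comments are approximate):

```python
# Three of the most common nonlinear activation functions, written out directly.
import numpy as np

def sigmoid(x):
    """Squashes any real input into (0, 1); common for binary outputs."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Squashes input into (-1, 1); zero-centered, unlike sigmoid."""
    return np.tanh(x)

def relu(x):
    """Passes positive values, zeroes out negatives; default for hidden layers."""
    return np.maximum(0.0, x)

xs = np.array([-2.0, 0.0, 2.0])
print(sigmoid(xs))  # [0.119 0.5   0.881]
print(tanh(xs))     # [-0.964  0.     0.964]
print(relu(xs))     # [0. 0. 2.]
```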